Abstract: In this paper, an analysis of the undetected error
probability of ensembles of m x n binary matrices is presented.
The main focus is the Bernoulli ensemble, whose members are
matrices generated from an i.i.d. Bernoulli source.
The main contributions of this work are
(i) the derivation of the error exponent of the average undetected
error probability and (ii) closed form expressions for the variance
of the undetected error probability. It is shown that the behavior
of the exponent for a sparse ensemble is somewhat different from
that for a dense ensemble. Furthermore, as a byproduct of the
proof of the variance formula, a simple covariance formula for the
weight distribution is derived.

I. Introduction 

Random coding is an extremely powerful technique to show 
the existence of a code satisfying certain properties. It has 
been used for proving the direct part (achievability) of many 
types of coding theorems. Recently, the idea of random coding 
has also come to be regarded as important from a practical 
point of view. An LDPC (Low-density parity-check) code can 
be constructed by choosing a parity check matrix from an 
ensemble of sparse matrices. Thus, there is a growing interest 
in randomly generated codes. 

One of the main difficulties associated with the use of 
randomly generated codes is the difficulty in evaluating the 
properties or performance of such codes. For example, it is 
difficult to evaluate minimum distance, weight distribution, 
ML decoding performance, etc. for these codes. To overcome 
this problem, we can take a probabilistic approach. In such an 
approach, we consider an ensemble of parity check matrices: 
i.e., probability is assigned to each matrix in the ensemble. 
A property of a matrix (e.g., minimum distance, weight 
distributions) can then be regarded as a random variable. It 
is natural to consider statistics of the random variable such 
as mean, variance, higher moments and covariance. In some 
cases, we can show that a property is strongly concentrated 
around its expectation. Such a concentration result justifies the 
use of the probabilistic approach. 

Recent advances in the analysis of the average weight 
distributions of LDPC codes, such as those described by Litsyn 
and Shevelev [4] [5], Burshtein and Miller [6], Richardson 
and Urbanke [9], show that the probabilistic approach is a 
useful technique for investigating typical properties of codes 
and matrices, which are not easy to obtain. Furthermore, the 
second moment analysis of the weight distribution of LDPC 
codes [7] [8] can be utilized to prove concentration results for 
weight distributions. 

The evaluation of the error detection probability of a given
code (or given parity check matrix) is a classical problem in
coding theory [2], [3], and some results on this topic have
been derived from the viewpoint of a probabilistic approach.
For example, for a linear code ensemble, the inequality
$P_U < 2^{-m}$ has long been known, where $P_U$ is the undetected error
probability and $m$ is the number of rows of a parity check
matrix. Since the undetected error probability can be expressed
as a linear combination of the weight distribution of a code,
there is a natural connection between the expectation of the
weight distribution and the expectation of the undetected error
probability.

In this paper, an analysis of the undetected error probability
of ensembles of binary matrices of size $m \times n$ is presented.
An error detection scheme is a crucial part of a feedback error
correction scheme such as ARQ (Automatic Repeat reQuest).
Detailed knowledge of the error detection performance of a
matrix ensemble would be useful for assessing the performance
of a feedback error correction scheme.

II. Average undetected error probability 
A. Notation 

For a given $m \times n$ ($m, n \ge 1$) binary parity check matrix
$H$, let $C(H)$ be the binary linear code of length $n$ defined by
$H$, namely, $C(H) = \{x \in F_2^n : H x^t = 0^m\}$, where $F_2$ is
the Galois field with two elements $\{0,1\}$ (the addition over
$F_2$ is denoted by $\oplus$). The notation $0^m$ denotes the zero vector
of length $m$. In this paper, a boldface letter, such as $x$ for
example, denotes a binary row vector.

Throughout the paper, a binary symmetric channel (BSC)
with crossover probability $\epsilon$ ($0 < \epsilon < 1/2$) is assumed.
We assume the conventional scenario for error detection: a
transmitter sends a codeword $x \in C(H)$ to a receiver via
a BSC with crossover probability $\epsilon$. The receiver obtains a
received word $y = x \oplus e$, where $e$ denotes an error vector.
The receiver first computes the syndrome $s = H y^t$ and then
checks whether $s = 0^m$ holds or not.

An undetected error event occurs when $H e^t = 0^m$ and
$e \ne 0^n$. This means that an error vector $e \in C(H)$ ($e \ne 0^n$)
causes an undetected error event. Thus, the undetected error
probability $P_U(H)$ can be expressed as

$$P_U(H) = \sum_{e \in C(H),\ e \ne 0^n} \epsilon^{w(e)} (1-\epsilon)^{n-w(e)} \tag{1}$$

where $w(x)$ denotes the Hamming weight of vector $x$. The
above equation can be rewritten as

$$P_U(H) = \sum_{w=1}^{n} A_w(H)\, \epsilon^{w} (1-\epsilon)^{n-w} \tag{2}$$

where $A_w(H)$ is defined by

$$A_w(H) \triangleq \sum_{x \in Z^{(n,w)}} I[H x^t = 0^m]. \tag{3}$$

The set $\{A_w(H)\}_{w=0}^{n}$ is usually called the weight distribution
of $C(H)$. The notation $Z^{(n,w)}$ denotes the set of binary n-tuples with
weight $w$. The notation $I[\text{condition}]$ is the indicator function
such that $I[\text{condition}] = 1$ if condition is true; otherwise, it
evaluates to 0.
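
The definitions of the weight distribution and of the undetected error probability can be sketched by exhaustive enumeration. This is a minimal illustration for small $n$ only (the cost grows as $2^n$); the function names and the choice of the [7,4] Hamming parity check matrix are assumptions made for this example and do not come from the paper.

```python
from itertools import product

def weight_distribution(H, n):
    """Return [A_0, ..., A_n] for the code C(H) = {x : H x^t = 0} over F_2."""
    A = [0] * (n + 1)
    for x in product((0, 1), repeat=n):
        # x is a codeword iff its inner product with every row of H is even
        if all(sum(h * b for h, b in zip(row, x)) % 2 == 0 for row in H):
            A[sum(x)] += 1
    return A

def undetected_error_probability(H, n, eps):
    """Evaluate the weight-distribution form: sum_{w=1}^n A_w eps^w (1-eps)^(n-w)."""
    A = weight_distribution(H, n)
    return sum(A[w] * eps**w * (1 - eps) ** (n - w) for w in range(1, n + 1))

# Illustrative input (not from the paper): a parity check matrix of the
# [7,4] Hamming code, whose columns are the binary expansions of 1..7.
H_HAMMING = [
    (1, 0, 1, 0, 1, 0, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]
print(weight_distribution(H_HAMMING, 7))
print(undetected_error_probability(H_HAMMING, 7, 0.01))
```

Enumeration is used here only because it mirrors the definitions directly; it is exactly the computation that becomes infeasible for large $n$, which motivates the ensemble approach.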

Suppose that $\mathcal{G}$ is a set of binary $m \times n$ matrices ($m, n \ge 1$).
Note that $\mathcal{G}$ may contain some matrices with all elements
identical. Such matrices should be distinguished as distinct
matrices. A probability $P(H)$ is associated with each matrix
$H$ in $\mathcal{G}$. Thus, $\mathcal{G}$ can be considered as an ensemble of binary
matrices. Let $f(H)$ be a real-valued function which depends
on $H \in \mathcal{G}$. The expectation of $f(H)$ with respect to the
ensemble $\mathcal{G}$ is defined by

$$E_{\mathcal{G}}[f(H)] \triangleq \sum_{H \in \mathcal{G}} P(H) f(H). \tag{4}$$

The average weight distribution of a given ensemble $\mathcal{G}$ is given
by $E_{\mathcal{G}}[A_w(H)]$. This quantity is very useful for analyzing the
performance of binary linear codes, including analysis of the
undetected error probability.

B. Bernoulli ensemble 

In this paper, we will focus on a parameterized ensemble
$\mathcal{B}_{m,n,k}$, which is called the Bernoulli ensemble, because the
Bernoulli ensemble is amenable to ensemble analysis. The
Bernoulli ensemble $\mathcal{B}_{m,n,k}$ contains all the binary $m \times n$
matrices ($m, n \ge 1$), whose elements are regarded as i.i.d.
binary random variables such that an element takes the value
1 with probability $p = k/n$. The parameter $k$ ($0 < k \le n/2$)
is a positive real number which represents the average number
of ones in each row. In other words, a matrix $H \in \mathcal{B}_{m,n,k}$
can be considered as an output from the Bernoulli source such
that symbol 1 occurs with probability $p$.

From the above definition, it is clear that a matrix $H \in
\mathcal{B}_{m,n,k}$ is associated with the probability

$$P(H) = p^{w(H)} (1-p)^{mn - w(H)}, \tag{5}$$

where $w(H)$ is the number of ones in $H$ (i.e., the Hamming
weight of $H$). The average weight distribution of the Bernoulli
ensemble is given by

$$E_{\mathcal{B}_{m,n,k}}[A_w(H)] = \left(\frac{1 + z^{w}}{2}\right)^{m} \binom{n}{w} \tag{6}$$

for $w \in [0, n]$, where $z \triangleq 1 - 2p$. The notation $[a, b]$ denotes
the set of consecutive integers from $a$ to $b$. The average weight
distribution of this ensemble was first discussed by Litsyn and
Shevelev [4].
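
The average weight distribution of the Bernoulli ensemble can be checked numerically against the definition of the ensemble average. The sketch below averages $A_w(H)$ over all $2^{mn}$ matrices, each weighted by $p^{w(H)}(1-p)^{mn-w(H)}$, and compares the result with the closed form $((1+z^w)/2)^m \binom{n}{w}$. Exact rational arithmetic is used so that the comparison is an equality; the function names and the small parameters are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def avg_weight_distribution_formula(m, n, p):
    """Closed form: E[A_w] = ((1 + z^w) / 2)^m * C(n, w), with z = 1 - 2p."""
    z = 1 - 2 * p
    return [((1 + z**w) / 2) ** m * comb(n, w) for w in range(n + 1)]

def avg_weight_distribution_bruteforce(m, n, p):
    """Average A_w over every m x n binary matrix, weighted by its probability."""
    avg = [Fraction(0)] * (n + 1)
    vectors = list(product((0, 1), repeat=n))
    for H in product(vectors, repeat=m):
        ones = sum(map(sum, H))
        prob = p**ones * (1 - p) ** (m * n - ones)  # probability of this matrix
        for x in vectors:
            if all(sum(a * b for a, b in zip(row, x)) % 2 == 0 for row in H):
                avg[sum(x)] += prob
    return avg

p = Fraction(1, 4)  # hypothetical choice: n = 3, k = 3/4
print(avg_weight_distribution_formula(2, 3, p))
print(avg_weight_distribution_bruteforce(2, 3, p))
```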



If $k$ is a constant (i.e., not a function of $n$), this ensemble can
be considered as an ensemble of sparse matrices. In the special
case where $k = n/2$, equal probability $1/2^{mn}$ is assigned
to every matrix in the Bernoulli ensemble. As a simplified
notation, we will denote $\mathcal{R}_{m,n} \triangleq \mathcal{B}_{m,n,n/2}$, where $\mathcal{R}_{m,n}$ is
called the random ensemble. Since a typical instance of $\mathcal{R}_{m,n}$
contains $\Theta(mn)$ ones, the ensemble can be regarded as an
ensemble of dense matrices.

C. Average undetected error probability of an ensemble 

For a given $m \times n$ matrix $H$, the evaluation of the undetected
error probability $P_U(H)$ is in general computationally difficult
because we need to know the weight distribution of $C(H)$
for such evaluation. On the other hand, in some cases, we
can evaluate the average of $P_U(H)$ for a given ensemble.
Such an average probability is useful for the estimation of
the undetected error probability of a matrix which belongs to
the ensemble.

Taking the ensemble average of the undetected error probability
over a given ensemble $\mathcal{G}$, we have

$$E_{\mathcal{G}}[P_U(H)] = E_{\mathcal{G}}\left[\sum_{w=1}^{n} A_w(H)\, \epsilon^{w}(1-\epsilon)^{n-w}\right] = \sum_{w=1}^{n} E_{\mathcal{G}}[A_w(H)]\, \epsilon^{w}(1-\epsilon)^{n-w}. \tag{7}$$

In the above equations, $H$ can be regarded as a random
variable. From this equation, it is evident that the average
of $P_U(H)$ can be evaluated if we know the average weight
distribution of the ensemble. For example, in the case of
the random ensemble $\mathcal{R}_{m,n}$, the average undetected error
probability has a simple closed form.

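Lemma 1: The average undetected error probability of the
random ensemble $\mathcal{R}_{m,n}$ is given by

$$E_{\mathcal{R}_{m,n}}[P_U(H)] = 2^{-m}\left(1 - (1-\epsilon)^{n}\right). \tag{8}$$

(Proof) By using (7), we have

$$E_{\mathcal{R}_{m,n}}[P_U(H)] = \sum_{w=1}^{n} E_{\mathcal{R}_{m,n}}[A_w(H)]\, \epsilon^{w}(1-\epsilon)^{n-w} = \sum_{w=1}^{n} 2^{-m}\binom{n}{w} \epsilon^{w}(1-\epsilon)^{n-w} = 2^{-m}\left(1 - (1-\epsilon)^{n}\right). \tag{9}$$

The second equality is based on the well known result [1]:

$$E_{\mathcal{R}_{m,n}}[A_w(H)] = 2^{-m}\binom{n}{w}. \tag{10}$$

The last equality is due to the binomial theorem. □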

D. Error exponent of undetected error probability 

For a given sequence of $(1-R)n \times n$ matrix ensembles
($n = 1, 2, 3, \ldots$), the average undetected error probability
is usually an exponentially decreasing function of $n$, where
$R$ is a real number satisfying $0 < R < 1$ (called the design
rate). Thus, the exponent of the undetected error probability is
of prime importance in understanding the asymptotic behavior
of the undetected error probability.

1) Definition of error exponent: Let $\{\mathcal{G}_n\}_{n>0}$ be a series
of ensembles such that $\mathcal{G}_n$ consists of $(1-R)n \times n$ binary
matrices. In order to see the asymptotic behavior of the
undetected error probability of this sequence of ensembles, it
is reasonable to define the error exponent of undetected error
probability in the following way:

Definition 1: The asymptotic error exponent of the average
undetected error probability for a series of ensembles $\{\mathcal{G}_n\}_{n>0}$
is defined by

$$T_{\mathcal{G}_n} \triangleq \lim_{n \to \infty} \frac{1}{n} \log_2 E_{\mathcal{G}_n}[P_U] \tag{11}$$

if the limit exists. □

Henceforth we will not explicitly express the dependence of
$P_U$ on $H$, writing instead $P_U$ to denote $P_U(H)$ in all cases
where there is no fear of confusion.

The following example describes the exponent of the random
ensemble.

Example 1: Consider the series of the random ensembles
$\{\mathcal{R}_{(1-R)n,n}\}_{n>0}$. It is easy to evaluate $T_{\mathcal{R}_{(1-R)n,n}}$:

$$T_{\mathcal{R}_{(1-R)n,n}} = \lim_{n \to \infty} \frac{1}{n} \log_2 \left[2^{-(1-R)n}\left(1 - (1-\epsilon)^{n}\right)\right] = -(1-R). \tag{12}$$

This equality implies that the average undetected error probability
of the sequence of random ensembles behaves like

$$E_{\mathcal{R}_{(1-R)n,n}}[P_U] \simeq 2^{-n(1-R)} \tag{13}$$

if $n$ is sufficiently large. Note that the exponent $-(1-R)$ is
independent of the crossover probability $\epsilon$. □
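
The closed form for the random ensemble and the convergence of its normalized exponent can be illustrated numerically. The sketch below first compares the weight-by-weight sum $\sum_{w \ge 1} 2^{-m}\binom{n}{w}\epsilon^w(1-\epsilon)^{n-w}$ with $2^{-m}(1-(1-\epsilon)^n)$ in exact arithmetic, and then evaluates $\frac{1}{n}\log_2$ of the closed form for growing $n$. The function names and parameter values are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb, log2

def avg_pu_random_sum(m, n, eps):
    """sum_{w=1}^n 2^-m C(n,w) eps^w (1-eps)^(n-w), using exact fractions."""
    return sum(Fraction(comb(n, w), 2**m) * eps**w * (1 - eps) ** (n - w)
               for w in range(1, n + 1))

def avg_pu_random_closed(m, n, eps):
    """Closed form of the average: 2^-m (1 - (1 - eps)^n)."""
    return Fraction(1, 2**m) * (1 - (1 - eps) ** n)

def normalized_exponent(n, R, eps):
    """(1/n) log2 of the closed form with m = (1 - R) n, in the log domain."""
    m = round((1 - R) * n)
    return (-m + log2(1 - (1 - eps) ** n)) / n

for n in (10, 100, 1000, 10000):
    # the printed values approach -(1 - R) = -0.5 as n grows
    print(n, normalized_exponent(n, R=0.5, eps=0.1))
```

Working in the log domain avoids underflow of $2^{-m}$ for large $n$; the term $(1-\epsilon)^n$ simply underflows to zero, which is harmless here.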

2) Error exponent and asymptotic growth rate: The asymptotic
growth rate of the average weight distribution (for
simplicity henceforth abbreviated as the asymptotic growth
rate), which is the basis of the derivation of the error exponent,
is defined as follows.

Definition 2: Suppose that a series of ensembles $\{\mathcal{G}_n\}_{n>0}$
is given. If

$$\lim_{n \to \infty} \frac{1}{n} \log_2 E_{\mathcal{G}_n}[A_{\ell n}]$$

exists for $0 \le \ell \le 1$, then we define the asymptotic growth
rate $f(\ell)$ by

$$f(\ell) \triangleq \lim_{n \to \infty} \frac{1}{n} \log_2 E_{\mathcal{G}_n}[A_{\ell n}]. \tag{14}$$

The parameter $\ell$ is called the normalized weight. □

From this definition, it is clear that

$$E_{\mathcal{G}_n}[A_{\ell n}] = 2^{n(f(\ell) + o(1))}, \tag{15}$$

where the notation $o(1)$ denotes terms which converge to 0 in
the limit as $n$ goes to infinity. The asymptotic growth rate of
some ensembles of binary matrices can be found in [4] [5] [6].

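The next theorem gives the error exponent of the undetected
error probability for a series of ensembles $\{\mathcal{G}_n\}_{n>0}$.

Theorem 1: The error exponent of $\{\mathcal{G}_n\}_{n>0}$ is given by

$$T_{\mathcal{G}_n} = \sup_{0 < \ell \le 1} \left[f(\ell) + \ell \log_2 \epsilon + (1-\ell)\log_2(1-\epsilon)\right], \tag{16}$$

where $f(\ell)$ is the asymptotic growth rate of $\{\mathcal{G}_n\}_{n>0}$.
(Proof) Based on the definition of asymptotic growth rate, we
can rewrite $T_{\mathcal{G}_n}$ in the form

$$\begin{aligned}
T_{\mathcal{G}_n} &= \lim_{n \to \infty} \frac{1}{n} \log_2 E_{\mathcal{G}_n}[P_U] \\
&= \lim_{n \to \infty} \frac{1}{n} \log_2 \sum_{w=1}^{n} E_{\mathcal{G}_n}[A_w]\, \epsilon^{w}(1-\epsilon)^{n-w} \\
&= \lim_{n \to \infty} \frac{1}{n} \log_2 \sum_{w=1}^{n} 2^{n\left(f(w/n) + K(\epsilon,n,w) + o(1)\right)},
\end{aligned}$$

where $K(\epsilon, n, w)$ is defined by

$$K(\epsilon, n, w) \triangleq \frac{w}{n} \log_2 \epsilon + \left(1 - \frac{w}{n}\right) \log_2(1-\epsilon). \tag{17}$$

Using a conventional technique for bounding summation, we
have the following upper bound on $T_{\mathcal{G}_n}$:

$$\begin{aligned}
T_{\mathcal{G}_n} &= \lim_{n \to \infty} \frac{1}{n} \log_2 \sum_{w=1}^{n} 2^{n\left(f(w/n) + K(\epsilon,n,w) + o(1)\right)} \\
&\le \lim_{n \to \infty} \frac{1}{n} \log_2 \left[ n \max_{w=1}^{n} 2^{n\left(f(w/n) + K(\epsilon,n,w) + o(1)\right)} \right] \\
&= \lim_{n \to \infty} \max_{w=1}^{n} \frac{1}{n} \log_2 2^{n\left(f(w/n) + K(\epsilon,n,w) + o(1)\right)} \\
&= \lim_{n \to \infty} \max_{w=1}^{n} \left[ f(w/n) + K(\epsilon,n,w) + o(1) \right] \\
&= \sup_{0 < \ell \le 1} \left[f(\ell) + \ell \log_2 \epsilon + (1-\ell)\log_2(1-\epsilon)\right].
\end{aligned} \tag{18}$$

The factor $n$ in the second line contributes $\frac{1}{n}\log_2 n$, which vanishes in the limit.
We can also show that $T_{\mathcal{G}_n}$ is greater than or equal to the
right-hand side of the above inequality (18) in a similar
manner. This means that the right-hand side of the inequality
is asymptotically tight.

□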

The next example discusses the case of the random ensemble.

Example 2: Let us again consider the series of the random
ensembles given by $\{\mathcal{R}_{(1-R)n,n}\}_{n>0}$. These ensembles have
the asymptotic growth rate $f(\ell) = h(\ell) - (1-R)$, where the
function $h(x)$ is the binary entropy function defined by

$$h(x) \triangleq -x \log_2 x - (1-x)\log_2(1-x). \tag{19}$$

In this case, by using Theorem 1, we have

$$T_{\mathcal{R}_{(1-R)n,n}} = \sup_{0 < \ell \le 1} \left[h(\ell) - (1-R) + \ell \log_2 \epsilon + (1-\ell)\log_2(1-\epsilon)\right]. \tag{20}$$

Let

$$D_{\ell,\epsilon} \triangleq \ell \log_2\left(\frac{\ell}{\epsilon}\right) + (1-\ell)\log_2\left(\frac{1-\ell}{1-\epsilon}\right). \tag{21}$$

By using $D_{\ell,\epsilon}$, we can rewrite (20) as

$$T_{\mathcal{R}_{(1-R)n,n}} = \sup_{0 < \ell \le 1} \left[-(1-R) - D_{\ell,\epsilon}\right]. \tag{22}$$

Since $D_{\ell,\epsilon}$ can be considered as the Kullback-Leibler divergence
between two probability distributions $(\ell, 1-\ell)$ and
$(\epsilon, 1-\epsilon)$, $D_{\ell,\epsilon}$ is always non-negative and $D_{\ell,\epsilon} = 0$ holds
if and only if $\ell = \epsilon$. Thus, we obtain

$$\sup_{0 < \ell \le 1} \left[-(1-R) - D_{\ell,\epsilon}\right] = -(1-R), \tag{23}$$

which is identical to the exponent obtained in expression (12).

Let $g_\epsilon^{(\mathrm{rnd})}(\ell) \triangleq h(\ell) - (1-R) + \ell \log_2 \epsilon + (1-\ell)\log_2(1-\epsilon)$.
Figure 1 displays the behavior of $g_\epsilon^{(\mathrm{rnd})}(\ell)$ when
$R = 0.5$. This figure confirms the result that the maximum
($\sup_{0 < \ell \le 1} g_\epsilon^{(\mathrm{rnd})}(\ell) = -0.5$) is attained at $\ell = \epsilon$.

□
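
The supremum in this example can be reproduced with a simple grid search over the normalized weight. The sketch below evaluates the objective $h(\ell) - (1-R) + \ell\log_2\epsilon + (1-\ell)\log_2(1-\epsilon)$ on a uniform grid and reports the maximizer; the grid resolution and helper names are assumptions of this illustration, and a grid search only approximates a supremum.

```python
from math import log2

def h(x):
    """Binary entropy function with the convention h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def g_rnd(ell, R, eps):
    """Objective for the random ensemble at normalized weight ell."""
    return h(ell) - (1 - R) + ell * log2(eps) + (1 - ell) * log2(1 - eps)

def grid_sup(func, steps=10000):
    """Return (argmax, max) of func over the grid ell = 1/steps, ..., 1."""
    best = max(range(1, steps + 1), key=lambda i: func(i / steps))
    return best / steps, func(best / steps)

for eps in (0.1, 0.2, 0.4):
    ell_star, value = grid_sup(lambda ell: g_rnd(ell, R=0.5, eps=eps))
    print(f"eps={eps}: maximizer ~ {ell_star:.4f}, value ~ {value:.6f}")
```

For each $\epsilon$ the reported maximizer should sit at (or next to) $\ell = \epsilon$ with value close to $-(1-R)$, in line with the divergence argument.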

Fig. 1. The curves of $g_\epsilon^{(\mathrm{rnd})}(\ell)$ for random ensembles with R = 0.5.



E. Error exponent of the Bernoulli ensemble with constant k 

The asymptotic growth rate of the Bernoulli ensemble
$\mathcal{B}_{(1-R)n,n,k}$ with a constant $k$ and design rate $R$ is given by

$$f(\ell) = h(\ell) + (1-R)\log_2\left(\frac{1 + e^{-2k\ell}}{2}\right). \tag{24}$$

This formula is presented in [4]. The error exponent of this
ensemble shows a different behavior from that for random
ensembles.

Example 3: Consider the Bernoulli ensemble with parameters
$R = 0.5$ and $k = 20$. Let

$$g_\epsilon^{(\mathrm{spm})}(\ell) \triangleq h(\ell) + (1-R)\log_2\left(\frac{1 + e^{-2k\ell}}{2}\right) + \ell \log_2 \epsilon + (1-\ell)\log_2(1-\epsilon). \tag{25}$$

Figure 2 includes the curves of $g_\epsilon^{(\mathrm{spm})}(\ell)$ where
$\epsilon = 0.1, 0.2, 0.4$. In contrast to $g_\epsilon^{(\mathrm{rnd})}(\ell)$ of a random ensemble,
we can see that $g_\epsilon^{(\mathrm{spm})}(\ell)$ is not a concave function. The shape
of the curve of $g_\epsilon^{(\mathrm{spm})}(\ell)$ depends on the crossover probability
$\epsilon$. For large $\epsilon$, $g_\epsilon^{(\mathrm{spm})}(\ell)$ takes its largest value around $\ell = \epsilon$. On
the other hand, for small $\epsilon$, $g_\epsilon^{(\mathrm{spm})}(\ell)$ has the supremum at
$\ell = 0$.

Figure 3 presents the error exponent of Bernoulli ensembles
with parameters $R = 0.3, 0.5, 0.7, 0.9$ and $k = 20$. As an
example, consider the exponent for $R = 0.5$. In the regime
where $\epsilon$ is smaller than (around) 0.3, the error exponent is a
monotonically decreasing function of $\epsilon$.
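
The two regimes described in this example can be explored numerically. The following sketch evaluates the sparse-ensemble objective, i.e. $h(\ell) + (1-R)\log_2\frac{1+e^{-2k\ell}}{2} + \ell\log_2\epsilon + (1-\ell)\log_2(1-\epsilon)$, on a fine grid for $R = 0.5$ and $k = 20$. The grid search, its resolution, and the helper names are assumptions of this illustration; near $\ell = 0$ the grid can only approach the supremum, whose limiting value is $\log_2(1-\epsilon)$.

```python
from math import exp, log2

def h(x):
    """Binary entropy function with the convention h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def g_spm(ell, R, k, eps):
    """Objective for the Bernoulli ensemble with constant k."""
    growth = h(ell) + (1 - R) * log2((1 + exp(-2 * k * ell)) / 2)
    return growth + ell * log2(eps) + (1 - ell) * log2(1 - eps)

def grid_sup(func, steps=100000):
    """Return (argmax, max) of func over the grid ell = 1/steps, ..., 1."""
    best = max(range(1, steps + 1), key=lambda i: func(i / steps))
    return best / steps, func(best / steps)

for eps in (0.1, 0.2, 0.4):
    ell_star, value = grid_sup(lambda ell: g_spm(ell, 0.5, 20, eps))
    print(f"eps={eps}: maximizer ~ {ell_star:.5f}, value ~ {value:.4f}")
```

For $\epsilon = 0.4$ the maximizer lies near $\ell = \epsilon$ and the value is close to $-(1-R)$, while for $\epsilon = 0.1$ the maximizer collapses toward $\ell = 0$, where low-weight codewords dominate.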

The examples suggest that a sparse ensemble has less
powerful error detection performance than a dense
ensemble (such as the random ensemble) in terms of the error
exponent. However, if the crossover probability is sufficiently
large, the difference in exponent between sparse and dense ensembles
is negligible. For example, the exponent of the Bernoulli
ensemble in Fig. 3 is almost equal to that of the random
ensemble when $\epsilon$ is larger than (around) 0.3.

Fig. 2. The curves of $g_\epsilon^{(\mathrm{spm})}(\ell)$ for Bernoulli ensembles.

Fig. 3. Horizontal axis: crossover probability $\epsilon$. The curves correspond to the parameters R = 0.3, 0.5, 0.7, 0.9 and k = 20.

The above properties of the error exponents of the Bernoulli
ensembles can be explained with reference to their average
weight distributions (or asymptotic growth rate). Figure 4
displays the asymptotic growth rates of a random ensemble
and a Bernoulli ensemble.

The weight of typical error vectors is very close to $\epsilon n$ when
$n$ is sufficiently large. For a large value of $\epsilon$, such as $\epsilon = 0.4$,
the average weight distribution around $w = 0.4n$, namely
$E_{\mathcal{G}}[A_{0.4n}]$, dominates the undetected error probability. In such
a range, the difference in the average weight distributions
corresponding to the random and the Bernoulli ensembles is
small. On the other hand, if the crossover probability is small,
the low-weight part of the weight distribution becomes the most influential
factor. The difference in the average weight distributions
at small weights results in a difference in the error exponent.

Note that the time complexity of the error detection operation
(multiplication of a received vector by a parity check
matrix) is $O(n^2)$ for a typical instance of a random ensemble
and $O(n)$ for a typical instance of a Bernoulli
ensemble with constant $k$. If $\epsilon$ is sufficiently large, a sparse
matrix therefore offers almost the same error detection
performance as a dense matrix, with linear time complexity.

Fig. 4. Asymptotic growth rate of a random ensemble and a Bernoulli
ensemble.

III. Variance of undetected error probability 

In the previous section, we have seen that the average weight 
distribution plays an important role in the derivation of average 
undetected error probability. Similarly, we need to examine 
the covariance of weight distribution in order to analyze the 
variance of undetected error probability. 

A. Covariance formula 

The covariance between two real-valued functions $f(\cdot)$, $g(\cdot)$
defined on an ensemble $\mathcal{G}$ is given by

$$\mathrm{Cov}_{\mathcal{G}}[f, g] \triangleq E_{\mathcal{G}}[fg] - E_{\mathcal{G}}[f]\, E_{\mathcal{G}}[g]. \tag{26}$$

The next theorem gives the covariance of the weight distribution
for the Bernoulli ensemble. It forms the basis of the derivation of the
variance of the undetected error probability for this ensemble.

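Theorem 2: The covariance of the weight distribution for
the Bernoulli ensemble $\mathcal{B}_{m,n,k}$ is given by

$$\mathrm{Cov}_{\mathcal{B}_{m,n,k}}(A_{w_1}, A_{w_2}) = \left(\frac{(1 + z^{w_1})(1 + z^{w_2})}{4}\right)^{m} \sum_{v=\max\{0,\, w_1+w_2-n\}}^{w_1} \binom{n}{w_1}\binom{w_1}{v}\binom{n-w_1}{w_2-v} \left[\left(1 + \frac{z^{w_1+w_2-2v} - z^{w_1+w_2}}{(1 + z^{w_1})(1 + z^{w_2})}\right)^{m} - 1\right] \tag{27}$$

for $1 \le w_1 \le w_2 \le n$ and

$$\mathrm{Cov}_{\mathcal{B}_{m,n,k}}(A_{w_1}, A_{w_2}) = \mathrm{Cov}_{\mathcal{B}_{m,n,k}}(A_{w_2}, A_{w_1}) \tag{28}$$

for $1 \le w_2 < w_1 \le n$, where $z = 1 - 2p$ and $p = k/n$.
(Proof) See Appendix. □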



Remark 1: When $k = n/2$, $\mathcal{B}_{m,n,k}$ becomes the random
ensemble $\mathcal{R}_{m,n}$. We discuss this case here.

We first assume that $1 \le w_1 \le w_2 \le n$. Let $p = 1/2$ (i.e.,
$k = n/2$). In such a case, we have $z = 1 - 2p = 0$. Define $L$
by

$$L \triangleq 1 + \frac{z^{w_1+w_2-2v} - z^{w_1+w_2}}{(1 + z^{w_1})(1 + z^{w_2})}. \tag{29}$$

The variable $L$ takes the following values:

$$L = \begin{cases} 1, & w_1 < w_2 \\ 1, & w_1 = w_2,\ v < w_1 \\ 2, & w_1 = w_2,\ v = w_1. \end{cases} \tag{30}$$

Substituting $z = 0$ into equation (27) and using the identity
(28), we get

$$\mathrm{Cov}_{\mathcal{R}_{m,n}}(A_{w_1}, A_{w_2}) = \begin{cases} 0, & 1 \le w_1 \ne w_2 \le n \\ 2^{-2m}\binom{n}{w_1}(2^{m} - 1), & 1 \le w_1 = w_2 \le n. \end{cases} \tag{31}$$

Another proof of this formula is presented in [10]. □
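
The covariance formula of Theorem 2 and its specialization to the random ensemble can be evaluated directly. The sketch below is one possible transcription of the formula into exact rational arithmetic: the sum over the overlap $v$ of the binomial counts times $[(1 + \text{ratio})^m - 1]$, scaled by $((1+z^{w_1})(1+z^{w_2})/4)^m$. It then compares the case $p = 1/2$ with the closed form $2^{-2m}\binom{n}{w}(2^m-1)$ on the diagonal and 0 elsewhere. Function names and test parameters are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def cov_bernoulli(m, n, p, w1, w2):
    """Covariance of (A_w1, A_w2) for the Bernoulli ensemble, with z = 1 - 2p."""
    if w1 > w2:  # the covariance is symmetric in (w1, w2)
        w1, w2 = w2, w1
    z = 1 - 2 * p
    base = (1 + z**w1) * (1 + z**w2)
    total = Fraction(0)
    for v in range(max(0, w1 + w2 - n), w1 + 1):
        # number of pairs (x, y) with weights w1, w2 and overlap v
        count = comb(n, w1) * comb(w1, v) * comb(n - w1, w2 - v)
        ratio = (z ** (w1 + w2 - 2 * v) - z ** (w1 + w2)) / base
        total += count * ((1 + ratio) ** m - 1)
    return (base / 4) ** m * total

def cov_random(m, n, w1, w2):
    """Closed form for the random ensemble (p = 1/2)."""
    if w1 != w2:
        return Fraction(0)
    return Fraction(comb(n, w1) * (2**m - 1), 4**m)

half = Fraction(1, 2)
print([[cov_bernoulli(3, 5, half, a, b) for b in range(1, 6)] for a in range(1, 6)])
```

Passing `p` as a `Fraction` keeps every intermediate value exact, so the two functions can be compared with `==` rather than with a tolerance.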

B. Variance of undetected error probability 

The variance of the undetected error probability is a straightforward
consequence of Theorem 2.

Corollary 1: The variance of the undetected error probability
of the Bernoulli ensemble, $\sigma^2_{\mathcal{B}_{m,n,k}}$, is given by

$$\sigma^2_{\mathcal{B}_{m,n,k}} = \sum_{w_1=1}^{n} \sum_{w_2=1}^{n} \mathrm{Cov}_{\mathcal{B}_{m,n,k}}(A_{w_1}, A_{w_2})\, \epsilon^{w_1+w_2} (1-\epsilon)^{2n-w_1-w_2}. \tag{32}$$

(Proof) The variance of the undetected error probability $P_U$ is
given by

$$\sigma^2_{\mathcal{B}_{m,n,k}} = E_{\mathcal{B}_{m,n,k}}[P_U^2] - E_{\mathcal{B}_{m,n,k}}[P_U]^2. \tag{33}$$

We first consider the second moment of the undetected error
probability:

$$E_{\mathcal{B}_{m,n,k}}[P_U^2] = E_{\mathcal{B}_{m,n,k}}\left[\left(\sum_{w=1}^{n} A_w\, \epsilon^{w}(1-\epsilon)^{n-w}\right)^{2}\right] = \sum_{w_1=1}^{n} \sum_{w_2=1}^{n} E_{\mathcal{B}_{m,n,k}}[A_{w_1} A_{w_2}]\, \epsilon^{w_1+w_2}(1-\epsilon)^{2n-w_1-w_2}. \tag{34}$$

The squared average undetected error probability can be
expressed as

$$E_{\mathcal{B}_{m,n,k}}[P_U]^2 = \left(E_{\mathcal{B}_{m,n,k}}\left[\sum_{w=1}^{n} A_w\, \epsilon^{w}(1-\epsilon)^{n-w}\right]\right)^{2} = \sum_{w_1=1}^{n} \sum_{w_2=1}^{n} E_{\mathcal{B}_{m,n,k}}[A_{w_1}]\, E_{\mathcal{B}_{m,n,k}}[A_{w_2}]\, \epsilon^{w_1+w_2}(1-\epsilon)^{2n-w_1-w_2}. \tag{35}$$

Combining these equalities and the covariance of the weight
distribution, the variance of undetected error probability
$\sigma^2_{\mathcal{B}_{m,n,k}}$ is obtained. □
Remark 2: The covariance of the weight distribution for a
given ensemble $\mathcal{B}_{m,n,k}$ is useful not only for the evaluation of
the variance of $P_U$. Let $X$ be a random variable represented
by

$$X = \sum_{w=0}^{n} a(w) A_w, \tag{36}$$

where $a(w)$ is a real-valued function of $w$. The covariance
of the weight distribution is required more generally for the
evaluation of the variance of $X$, which is given by

$$\sigma^2_X = \sum_{w_1=0}^{n} \sum_{w_2=0}^{n} \mathrm{Cov}_{\mathcal{B}_{m,n,k}}(A_{w_1}, A_{w_2})\, a(w_1)\, a(w_2). \tag{37}$$

A specialized version (the case where $X = P_U$) of this
equation has been derived in the previous corollary. □
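Example 4: Let us consider the Bernoulli ensemble with
$m = 1$, $n = 2$ and $k = 1/2$ ($p = 1/4$). Table I displays the
weight distributions and undetected error probabilities for the
4 matrices in $\mathcal{B}_{1,2,1/2}$.

TABLE I
Weight distributions and undetected error probabilities

H      | C(H)             | A_1(H) | A_2(H) | P_U(H)
(0,0)  | {00, 01, 10, 11} | 2      | 1      | $2\epsilon - \epsilon^2$
(0,1)  | {00, 10}         | 1      | 0      | $\epsilon - \epsilon^2$
(1,0)  | {00, 01}         | 1      | 0      | $\epsilon - \epsilon^2$
(1,1)  | {00, 11}         | 0      | 1      | $\epsilon^2$

From the definition of a Bernoulli ensemble, the following
probability is assigned to each matrix: P((0,0)) =
9/16, P((0,1)) = 3/16, P((1,0)) = 3/16, P((1,1)) = 1/16.
Combining the undetected error probabilities presented in
Table I and the above probability assignment, we immediately
have the first and second moments:

$$E_{\mathcal{B}_{1,2,1/2}}[P_U] = \frac{3}{2}\epsilon - \frac{7}{8}\epsilon^2, \tag{38}$$

$$E_{\mathcal{B}_{1,2,1/2}}[P_U^2] = \frac{21}{8}\epsilon^2 - 3\epsilon^3 + \epsilon^4. \tag{39}$$

From these moments, the variance can be derived:

$$\sigma^2_{\mathcal{B}_{1,2,1/2}} = E_{\mathcal{B}_{1,2,1/2}}[P_U^2] - E_{\mathcal{B}_{1,2,1/2}}[P_U]^2 = \frac{3}{8}\epsilon^2 - \frac{3}{8}\epsilon^3 + \frac{15}{64}\epsilon^4. \tag{40}$$

We can also consider another route to derive the variance
by using Corollary 1. The covariances of $\mathcal{B}_{1,2,1/2}$ are given
by

$$\mathrm{Cov}_{\mathcal{B}_{1,2,1/2}}(A_1, A_1) = 3/8 \tag{41}$$
$$\mathrm{Cov}_{\mathcal{B}_{1,2,1/2}}(A_1, A_2) = \mathrm{Cov}_{\mathcal{B}_{1,2,1/2}}(A_2, A_1) = 3/16 \tag{42}$$
$$\mathrm{Cov}_{\mathcal{B}_{1,2,1/2}}(A_2, A_2) = 15/64. \tag{43}$$

From Corollary 1, we obtain the variance

$$\begin{aligned}
\sigma^2_{\mathcal{B}_{1,2,1/2}} &= \sum_{w_1=1}^{2} \sum_{w_2=1}^{2} \mathrm{Cov}_{\mathcal{B}_{1,2,1/2}}(A_{w_1}, A_{w_2})\, \epsilon^{w_1+w_2}(1-\epsilon)^{4-w_1-w_2} \\
&= (3/8)\epsilon^2(1-\epsilon)^2 + (3/16)\epsilon^3(1-\epsilon) + (3/16)\epsilon^3(1-\epsilon) + (15/64)\epsilon^4 \\
&= \frac{3}{8}\epsilon^2 - \frac{3}{8}\epsilon^3 + \frac{15}{64}\epsilon^4,
\end{aligned}$$

which is identical to expression (40). □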
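
The numbers in this example are small enough to recompute by enumerating the four matrices of the ensemble. The sketch below evaluates $P_U(H)$ from its definition for every $1 \times 2$ matrix, weights each by its Bernoulli probability with $p = 1/4$, and forms the first moment, second moment and variance at a sample rational value of $\epsilon$. The function names and the sample value $\epsilon = 1/10$ are assumptions made for this check.

```python
from fractions import Fraction
from itertools import product

def pu(H, n, eps):
    """Sum of eps^w(e) (1-eps)^(n-w(e)) over the nonzero codewords e of C(H)."""
    total = Fraction(0)
    for e in product((0, 1), repeat=n):
        in_code = all(sum(a * b for a, b in zip(row, e)) % 2 == 0 for row in H)
        if any(e) and in_code:
            total += eps ** sum(e) * (1 - eps) ** (n - sum(e))
    return total

def pu_moments(m, n, p, eps):
    """First and second moments of P_U over the Bernoulli ensemble, by enumeration."""
    first = second = Fraction(0)
    rows = list(product((0, 1), repeat=n))
    for H in product(rows, repeat=m):
        ones = sum(map(sum, H))
        prob = p**ones * (1 - p) ** (m * n - ones)
        value = pu(H, n, eps)
        first += prob * value
        second += prob * value**2
    return first, second

eps = Fraction(1, 10)  # sample crossover probability for the check
first, second = pu_moments(1, 2, Fraction(1, 4), eps)
variance = second - first**2
print(first, second, variance)
print(Fraction(3, 8) * eps**2 - Fraction(3, 8) * eps**3 + Fraction(15, 64) * eps**4)
```

The last two printed values should agree, since the variance polynomial $\frac{3}{8}\epsilon^2 - \frac{3}{8}\epsilon^3 + \frac{15}{64}\epsilon^4$ is an identity in $\epsilon$.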
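In the case of $k = n/2$ (i.e., the case of a random ensemble),
we can derive a closed form expression for the variance.

Corollary 2: For the random ensemble $\mathcal{R}_{m,n}$, the variance
of the undetected error probability $P_U$ is given by

$$\sigma^2_{\mathcal{R}_{m,n}} = (1 - 2^{-m})\, 2^{-m} \left((\epsilon^2 + (1-\epsilon)^2)^{n} - (1-\epsilon)^{2n}\right). \tag{44}$$

(Proof) The variance of undetected error probability $\sigma^2_{\mathcal{R}_{m,n}}$
can be obtained in the following way:

$$\begin{aligned}
\sigma^2_{\mathcal{R}_{m,n}} &= E_{\mathcal{R}_{m,n}}[P_U^2] - E_{\mathcal{R}_{m,n}}[P_U]^2 \\
&= \sum_{w_1=1}^{n} \sum_{w_2=1}^{n} \mathrm{Cov}_{\mathcal{R}_{m,n}}(A_{w_1}, A_{w_2})\, \epsilon^{w_1+w_2}(1-\epsilon)^{2n-w_1-w_2} \\
&= \sum_{w=1}^{n} (1 - 2^{-m})\, 2^{-m} \binom{n}{w} \epsilon^{2w}(1-\epsilon)^{2n-2w}.
\end{aligned}$$

The second equality is due to Corollary 1. The last equality
is due to Eq. (31). We can further simplify the expression
using the binomial theorem:

$$\begin{aligned}
\sigma^2_{\mathcal{R}_{m,n}} &= (1 - 2^{-m})\, 2^{-m} \sum_{w=0}^{n} \binom{n}{w} (\epsilon^2)^{w} ((1-\epsilon)^2)^{n-w} - (1 - 2^{-m})\, 2^{-m} (1-\epsilon)^{2n} \\
&= (1 - 2^{-m})\, 2^{-m} \left((\epsilon^2 + (1-\epsilon)^2)^{n} - (1-\epsilon)^{2n}\right).
\end{aligned} \tag{45}$$

The last equality is the claim of the corollary. □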
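
The closed form for the variance over the random ensemble can be cross-checked by brute force for tiny parameters. The sketch below enumerates all $2^{mn}$ matrices with equal probability, computes the variance of $P_U(H)$ exactly, and compares it with $(1-2^{-m})2^{-m}((\epsilon^2+(1-\epsilon)^2)^n-(1-\epsilon)^{2n})$. The function names and the parameters $m = 2$, $n = 3$, $\epsilon = 1/5$ are assumptions chosen to keep the enumeration small.

```python
from fractions import Fraction
from itertools import product

def pu(H, n, eps):
    """Undetected error probability of C(H), summed over nonzero codewords."""
    total = Fraction(0)
    for e in product((0, 1), repeat=n):
        in_code = all(sum(a * b for a, b in zip(row, e)) % 2 == 0 for row in H)
        if any(e) and in_code:
            total += eps ** sum(e) * (1 - eps) ** (n - sum(e))
    return total

def variance_random_bruteforce(m, n, eps):
    """Variance of P_U when every m x n binary matrix is equally likely."""
    rows = list(product((0, 1), repeat=n))
    values = [pu(H, n, eps) for H in product(rows, repeat=m)]
    mean = sum(values) / len(values)
    return sum(v * v for v in values) / len(values) - mean**2

def variance_random_closed(m, n, eps):
    """Closed form: (1 - 2^-m) 2^-m ((eps^2 + (1-eps)^2)^n - (1-eps)^(2n))."""
    s = Fraction(1, 2**m)
    return (1 - s) * s * ((eps**2 + (1 - eps) ** 2) ** n - (1 - eps) ** (2 * n))

eps = Fraction(1, 5)
print(variance_random_bruteforce(2, 3, eps), variance_random_closed(2, 3, eps))
```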
The next example facilitates an understanding of how the
average and the variance of $P_U$ behave.

Example 5: We consider the random ensemble with $m = 20$,
$n = 40$, and the Bernoulli ensemble with $m = 20$, $n = 40$,
$k = 5$ (labeled "Sparse" in Fig. 5). Figure 5 depicts the
average undetected error probabilities of the two ensembles.
It can be observed that the average undetected error probability
of the random ensemble monotonically decreases as $\epsilon$
decreases. In contrast, the curve for the Bernoulli ensemble has
a peak around $\epsilon \simeq 0.025$. Figure 6 shows the variance of $P_U$
for the above two ensembles. The two curves have a similar
shape, but the variance of the sparse ensemble is always larger
than that of the random ensemble. □

C. Asymptotic behavior 

We here discuss the asymptotic behavior of the covariance
of the weight distribution and the variance of $P_U$ for the
Bernoulli ensemble. The following corollary explains the
asymptotic behavior of the covariance of the weight distribution.

Fig. 5. Average undetected error probabilities.

Fig. 6. Variance of undetected error probability.


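Corollary 3: Let the asymptotic growth rate of the covariance
of the weight distribution of the Bernoulli ensemble be
$T(\ell_1, \ell_2)$, defined by

$$T(\ell_1, \ell_2) \triangleq \lim_{n \to \infty} \frac{1}{n} \log_2 \mathrm{Cov}_{\mathcal{B}_{(1-R)n,n,k}}(A_{\ell_1 n}, A_{\ell_2 n}) \tag{46}$$

for $0 < \ell_1, \ell_2 \le 1$ and $0 < R < 1$. The asymptotic growth
rate is given by

$$T(\ell_1, \ell_2) = \sup_{\max\{0,\, \ell_1+\ell_2-1\} \le \nu \le \ell_1} Q(\nu) \tag{47}$$

for $0 < \ell_1 \le \ell_2 \le 1$ and

$$T(\ell_1, \ell_2) = T(\ell_2, \ell_1) \tag{48}$$

for $0 < \ell_2 < \ell_1 \le 1$, where $Q(\nu)$ is defined by
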

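$$Q(\nu) \triangleq -2(1-R) + h(\ell_1) + \ell_1 h\left(\frac{\nu}{\ell_1}\right) + (1-\ell_1)\, h\left(\frac{\ell_2-\nu}{1-\ell_1}\right) + \sup_{0 < \mu \le 1-R} a(\mu, \nu). \tag{49}$$

The function $a(\mu, \nu)$ is defined by

$$a(\mu, \nu) \triangleq (1-R)\, h\left(\frac{\mu}{1-R}\right) + \mu \log_2\left(e^{-2k(\ell_1+\ell_2-2\nu)} - e^{-2k(\ell_1+\ell_2)}\right) + (1-R-\mu)\log_2\left((1 + e^{-2k\ell_1})(1 + e^{-2k\ell_2})\right). \tag{50}$$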
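(Proof) We here rewrite the covariance formula (27) into
asymptotic form. By using the binomial theorem, we have

$$\left(1 + \frac{z^{w_1+w_2-2v} - z^{w_1+w_2}}{(1 + z^{w_1})(1 + z^{w_2})}\right)^{m} - 1 = \sum_{i=1}^{m} \binom{m}{i} \left(\frac{z^{w_1+w_2-2v} - z^{w_1+w_2}}{(1 + z^{w_1})(1 + z^{w_2})}\right)^{i}. \tag{51}$$

By using this identity, the covariance in (27) can be rewritten
in the following form:

$$\mathrm{Cov}_{\mathcal{B}_{m,n,k}}(A_{w_1}, A_{w_2}) = 2^{-2m} \sum_{v=\max\{0,\, w_1+w_2-n\}}^{w_1} \binom{n}{w_1}\binom{w_1}{v}\binom{n-w_1}{w_2-v}\, \Theta, \tag{52}$$

where $\Theta$ is defined by

$$\Theta \triangleq \sum_{i=1}^{m} \binom{m}{i} \left(z^{w_1+w_2-2v} - z^{w_1+w_2}\right)^{i} \left((1 + z^{w_1})(1 + z^{w_2})\right)^{m-i}.$$

Letting $w_1 = \ell_1 n$, $w_2 = \ell_2 n$, $v = \nu n$, $m = (1-R)n$, we
have

$$\lim_{n \to \infty} \frac{1}{n} \log_2 2^{-2m} = -2(1-R) \tag{53}$$

and

$$\lim_{n \to \infty} \frac{1}{n} \log_2 \binom{n}{w_1}\binom{w_1}{v}\binom{n-w_1}{w_2-v} = h(\ell_1) + \ell_1 h\left(\frac{\nu}{\ell_1}\right) + (1-\ell_1)\, h\left(\frac{\ell_2-\nu}{1-\ell_1}\right). \tag{54}$$

If $k$ is a constant and $0 < \ell \le 1$, then, making use of the
identity [4]

$$\lim_{n \to \infty} z^{\ell n} = \lim_{n \to \infty} \left(1 - \frac{2k}{n}\right)^{\ell n} = e^{-2k\ell}, \tag{55}$$

we get

$$\lim_{n \to \infty} \frac{1}{n} \log_2 \Theta = \sup_{0 < \mu \le 1-R} a(\mu, \nu). \tag{56}$$

Combining these asymptotic expressions, the claim of the
corollary is derived. □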

The following corollary gives the asymptotic growth rate of 
the variance of the undetected error probability. 

Corollary 4: The asymptotic growth rate of the variance of
the undetected error probability is given by

$$\lim_{n \to \infty} \frac{1}{n} \log_2 \sigma^2_{\mathcal{B}_{(1-R)n,n,k}} = \sup_{0 < \ell_1 \le 1}\ \sup_{0 < \ell_2 \le 1} S(\ell_1, \ell_2), \tag{57}$$

where $S(\ell_1, \ell_2)$ is given by

$$S(\ell_1, \ell_2) \triangleq (\ell_1+\ell_2)\log_2 \epsilon + (2-\ell_1-\ell_2)\log_2(1-\epsilon) + T(\ell_1, \ell_2). \tag{58}$$

(Proof) It is evident that

$$\lim_{n \to \infty} \frac{1}{n} \log_2 \left(\epsilon^{\ell_1 n + \ell_2 n}(1-\epsilon)^{2n - \ell_1 n - \ell_2 n}\right) = (\ell_1+\ell_2)\log_2 \epsilon + (2-\ell_1-\ell_2)\log_2(1-\epsilon) \tag{59}$$

holds. Combining this identity and Corollaries 1 and 3, we
immediately have the claim of the corollary. □

IV. Appendix 

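1) Preparation of the proof: The second moment of the
weight distribution for a given ensemble $\mathcal{G}$ is given by

$$E_{\mathcal{G}}[A_{w_1} A_{w_2}] = E_{\mathcal{G}}\left[\sum_{x \in Z^{(n,w_1)}} \sum_{y \in Z^{(n,w_2)}} I[H x^t = 0^m]\, I[H y^t = 0^m]\right]$$

for $0 \le w_1, w_2 \le n$. Since

$$I[H x^t = 0^m]\, I[H y^t = 0^m] = I[H x^t = 0^m,\ H y^t = 0^m],$$

we have

$$E_{\mathcal{G}}[A_{w_1} A_{w_2}] = E_{\mathcal{G}}\left[\sum_{x \in Z^{(n,w_1)}} \sum_{y \in Z^{(n,w_2)}} I[H x^t = 0^m,\ H y^t = 0^m]\right] = \sum_{x \in Z^{(n,w_1)}} \sum_{y \in Z^{(n,w_2)}} E_{\mathcal{G}}\left[I[H x^t = 0^m,\ H y^t = 0^m]\right]. \tag{60}$$

We here encounter the problem of evaluating the probability that
both $H x^t = 0^m$ and $H y^t = 0^m$ occur. In preparation
for solving this problem, we introduce some notation: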

Definition 3: For a given pair $(x, y) \in Z^{(n,w_1)} \times Z^{(n,w_2)}$,
the index sets $I_1, I_2, I_3, I_4$ are defined as follows:

$$I_1 \triangleq \{k \in [1,n] : x_k = 1,\ y_k = 0\} \tag{61}$$
$$I_2 \triangleq \{k \in [1,n] : x_k = 1,\ y_k = 1\} \tag{62}$$
$$I_3 \triangleq \{k \in [1,n] : x_k = 0,\ y_k = 1\} \tag{63}$$
$$I_4 \triangleq \{k \in [1,n] : x_k = 0,\ y_k = 0\}, \tag{64}$$

where $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$. These
regions are illustrated in Fig. 7. The size of each index set is denoted
by $i_k = \# I_k$ ($k = 1, 2, 3, 4$). Let $h = (h_1, h_2, \ldots, h_n)$
be a binary n-tuple. The partial weight of $h$ corresponding to
an index set $I_k$ ($k = 1, 2, 3, 4$) is denoted by $w_k(h)$, namely

$$w_k(h) \triangleq \#\{j \in I_k : h_j = 1\}. \tag{65}$$

□

Since the index sets are mutually exclusive, the equation
$i_1 + i_2 + i_3 + i_4 = n$ holds and $i_2$ can take an integer value
in the following range:

$$\max\{w_1 + w_2 - n,\ 0\} \le i_2 \le \min\{w_1, w_2\}. \tag{66}$$

The size of each index set can be expressed as $i_1 = w_1 - i_2$,
$i_3 = w_2 - i_2$, $i_4 = n - (w_1 + w_2 - i_2)$.

Fig. 7. The 4 regions $I_1, I_2, I_3, I_4$ ($x$ has $w_1$ ones and $y$ has $w_2$ ones).



A. Proof of Theorem 2 (Covariance of the Bernoulli ensemble)

Let $x \in Z^{(n,w_1)}$ and $y \in Z^{(n,w_2)}$ be binary vectors
satisfying $w_1 \le w_2$. In this proof, we first prove the following
equality:

$$E_{\mathcal{B}_{m,n,k}}\left[I[H x^t = 0^m,\ H y^t = 0^m]\right] = \left(\frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v}}{4}\right)^{m}, \tag{67}$$

where $v = \#(\mathrm{Supp}(x) \cap \mathrm{Supp}(y))$, $z = 1 - 2p$ and $p = k/n$.
The support set $\mathrm{Supp}(v)$ is defined by

$$\mathrm{Supp}(v) \triangleq \{i \in [1,n] : v_i \ne 0\}, \tag{68}$$

where $v = (v_1, v_2, \ldots, v_n)$.

We need to consider the following three cases: Case (i):
$0 < i_2 < w_1$ (i.e., the intersection of $\mathrm{Supp}(x)$ and $\mathrm{Supp}(y)$
is not empty but $\mathrm{Supp}(y)$ does not include $\mathrm{Supp}(x)$), Case
(ii): $i_2 = 0$ (i.e., the intersection of $\mathrm{Supp}(x)$ and $\mathrm{Supp}(y)$ is
empty), Case (iii): $i_2 = w_1$ (i.e., $\mathrm{Supp}(y)$ includes $\mathrm{Supp}(x)$).

We first study Case (i). Suppose that a binary n-tuple $h$ is
generated from a Bernoulli source with $\Pr[h_i = 1] = p$ ($i \in
[1,n]$). Recall that $p$ is defined by $p = k/n$. In this case,
$h x^t = 0$, $h y^t = 0$ holds if and only if $w_i(h)$ is even for
$i = 1, 2, 3$ or $w_i(h)$ is odd for $i = 1, 2, 3$.

It is well known that a binary vector $(t_1, t_2, \ldots, t_u)$ generated
from a Bernoulli source has even weight with probability
$(1 + (1-2q)^{u})/2$, where $q$ is the probability that $t_i$ ($i \in [1,u]$)
takes 1 [1]. The probability that $(t_1, t_2, \ldots, t_u)$ has an odd
weight is given by $(1 - (1-2q)^{u})/2$. For example, the
probability that $w_1(h)$ becomes even is $(1 + z^{i_1})/2$, where
$z = 1 - 2p$.
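
The even-weight probability used in this step can be confirmed by summing the binomial probabilities of all even weights. The sketch below does this in exact arithmetic and compares the result with $(1 + (1-2q)^u)/2$; the function names and sample values are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def prob_even_weight(u, q):
    """Sum the probabilities of all even weights of a length-u Bernoulli(q) vector."""
    return sum(comb(u, j) * q**j * (1 - q) ** (u - j) for j in range(0, u + 1, 2))

def prob_even_closed(u, q):
    """Closed form: (1 + (1 - 2q)^u) / 2."""
    return (1 + (1 - 2 * q) ** u) / 2

q = Fraction(1, 4)
for u in range(0, 8):
    print(u, prob_even_weight(u, q), prob_even_closed(u, q))
```

The identity follows from averaging the binomial expansions of $((1-q) + q)^u$ and $((1-q) - q)^u$, which cancels the odd-weight terms.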

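Based on the above argument, we can write the probability
$\Pr[h x^t = 0,\ h y^t = 0]$ as a function of $z$:

$$\begin{aligned}
\Pr[h x^t = 0,\ h y^t = 0] &= \frac{(1 + z^{i_1})(1 + z^{i_2})(1 + z^{i_3}) + (1 - z^{i_1})(1 - z^{i_2})(1 - z^{i_3})}{8} \\
&= \frac{1 + z^{i_1+i_2} + z^{i_2+i_3} + z^{i_1+i_3}}{4} \\
&= \frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v}}{4},
\end{aligned} \tag{69}$$

where $v = i_2$.

We next consider Case (ii). For this case, $v = i_2$ is assumed
to be zero. In this case, $h x^t = 0$, $h y^t = 0$ holds if and only
if both $w_1(h)$ and $w_3(h)$ are even. The probability that $h$
satisfies $h x^t = 0$ and $h y^t = 0$ under the condition $i_2 = 0$ is
given by

$$\Pr[h x^t = 0,\ h y^t = 0] = \left(\frac{1 + z^{i_1}}{2}\right)\left(\frac{1 + z^{i_3}}{2}\right) = \left(\frac{1 + z^{w_1}}{2}\right)\left(\frac{1 + z^{w_2}}{2}\right) = \frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v}}{4}. \tag{70}$$

Finally we consider Case (iii). Assume the case $v = i_2 =
w_1$, $x \ne y$. In this case, $h x^t = 0$, $h y^t = 0$ holds if and only
if both $w_2(h)$ and $w_3(h)$ are even. The probability $\Pr[h x^t =
0,\ h y^t = 0]$ under the condition $v = w_1$, $x \ne y$ is thus given
by

$$\Pr[h x^t = 0,\ h y^t = 0] = \left(\frac{1 + z^{i_2}}{2}\right)\left(\frac{1 + z^{i_3}}{2}\right) = \left(\frac{1 + z^{w_1}}{2}\right)\left(\frac{1 + z^{w_2-w_1}}{2}\right) = \frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v}}{4}. \tag{71}$$

We next consider the case $x = y$. For this case, we also have

$$\Pr[h x^t = 0,\ h y^t = 0] = \frac{1 + z^{w_1}}{2} = \frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v}}{4}, \tag{72}$$

since $w_1 = w_2 = v$. In summary, for all cases (Cases (i), (ii), (iii)),

$$\Pr[h x^t = 0,\ h y^t = 0] = \frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v}}{4} \tag{73}$$

holds. Since the rows of parity check matrices in $\mathcal{B}_{m,n,k}$ can
be independently chosen, we obtain Eq. (67) in the following
way:

$$\begin{aligned}
E_{\mathcal{B}_{m,n,k}}\left[I[H x^t = 0^m,\ H y^t = 0^m]\right] &= \Pr[H x^t = 0^m,\ H y^t = 0^m] \\
&= \Pr[h x^t = 0,\ h y^t = 0]^{m} \\
&= \left(\frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v}}{4}\right)^{m}.
\end{aligned} \tag{74}$$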
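
The single-row probability derived in this case analysis can be checked exhaustively for short vectors. The sketch below enumerates every row $h \in \{0,1\}^n$ with its Bernoulli probability, accumulates the mass of rows satisfying both parity conditions, and compares it with $(1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v})/4$ for every pair $(x, y)$. The function names and the parameters $n = 4$, $p = 1/5$ are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

def joint_prob_bruteforce(x, y, p):
    """Pr[h x^t = 0, h y^t = 0] for a row h with i.i.d. Bernoulli(p) entries."""
    n = len(x)
    total = Fraction(0)
    for h in product((0, 1), repeat=n):
        hx = sum(a * b for a, b in zip(h, x)) % 2
        hy = sum(a * b for a, b in zip(h, y)) % 2
        if hx == 0 and hy == 0:
            total += p ** sum(h) * (1 - p) ** (n - sum(h))
    return total

def joint_prob_formula(x, y, p):
    """Closed form with w1 = w(x), w2 = w(y) and v = size of the common support."""
    z = 1 - 2 * p
    w1, w2 = sum(x), sum(y)
    v = sum(a & b for a, b in zip(x, y))
    return (1 + z**w1 + z**w2 + z ** (w1 + w2 - 2 * v)) / 4

p = Fraction(1, 5)
vectors = list(product((0, 1), repeat=4))
mismatches = [(x, y) for x in vectors for y in vectors
              if joint_prob_bruteforce(x, y, p) != joint_prob_formula(x, y, p)]
print(len(mismatches))
```

The closed form is symmetric in $x$ and $y$, so the check does not need the ordering $w_1 \le w_2$ assumed in the proof.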

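Combining (60) and (67), we have

$$\begin{aligned}
E_{\mathcal{B}_{m,n,k}}[A_{w_1} A_{w_2}] &= \sum_{x \in Z^{(n,w_1)}} \sum_{y \in Z^{(n,w_2)}} E_{\mathcal{B}_{m,n,k}}\left[I[H x^t = 0^m,\ H y^t = 0^m]\right] \\
&= \sum_{x \in Z^{(n,w_1)}} \sum_{y \in Z^{(n,w_2)}} \left(\frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v}}{4}\right)^{m} \\
&= \sum_{v=\max\{0,\, w_1+w_2-n\}}^{w_1} \binom{n}{w_1}\binom{w_1}{v}\binom{n-w_1}{w_2-v} \left(\frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2-2v}}{4}\right)^{m}.
\end{aligned} \tag{75}$$

In the last step, the pairs $(x, y)$ are grouped by the size $v$ of their
common support; there are $\binom{n}{w_1}\binom{w_1}{v}\binom{n-w_1}{w_2-v}$ pairs for each $v$.
Since

$$E_{\mathcal{B}_{m,n,k}}[A_w] = \left(\frac{1 + z^{w}}{2}\right)^{m} \binom{n}{w} \tag{76}$$

holds [4], we thus have

$$\begin{aligned}
E_{\mathcal{B}_{m,n,k}}[A_{w_1}]\, E_{\mathcal{B}_{m,n,k}}[A_{w_2}] &= \left(\frac{1 + z^{w_1}}{2}\right)^{m} \left(\frac{1 + z^{w_2}}{2}\right)^{m} \binom{n}{w_1}\binom{n}{w_2} \\
&= \sum_{v=\max\{0,\, w_1+w_2-n\}}^{w_1} \binom{n}{w_1}\binom{w_1}{v}\binom{n-w_1}{w_2-v} \left(\frac{1 + z^{w_1} + z^{w_2} + z^{w_1+w_2}}{4}\right)^{m}.
\end{aligned} \tag{77}$$

The last equality is due to the following combinatorial identity:

$$\sum_{v=\max\{0,\, w_1+w_2-n\}}^{w_1} \binom{n}{w_1}\binom{w_1}{v}\binom{n-w_1}{w_2-v} = \binom{n}{w_1}\binom{n}{w_2}. \tag{78}$$

We are ready to derive the covariance of weight distributions
for the case $w_1 \le w_2$. Substituting (75) and (77) into

$$\mathrm{Cov}_{\mathcal{B}_{m,n,k}}(A_{w_1}, A_{w_2}) = E_{\mathcal{B}_{m,n,k}}[A_{w_1} A_{w_2}] - E_{\mathcal{B}_{m,n,k}}[A_{w_1}]\, E_{\mathcal{B}_{m,n,k}}[A_{w_2}],$$

we have (27) in the claim of the theorem. Since the covariance
is symmetric in its arguments, $\mathrm{Cov}_{\mathcal{B}_{m,n,k}}(A_{w_1}, A_{w_2}) =
\mathrm{Cov}_{\mathcal{B}_{m,n,k}}(A_{w_2}, A_{w_1})$ holds if $w_1 > w_2$. □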